In the standard model (SM), top quarks are expected
to be produced singly in $p\bar{p}$ collisions through s-channel
or t-channel exchange of a virtual W boson [1]. The reasons
for studying single top quarks are compelling: the
production cross section is directly proportional to the
square of the CKM matrix [2] element $|V_{tb}|$, and thus
a measurement of the rate constrains fourth-generation
models, models with flavor-changing neutral currents,
and other new phenomena [3]. Electroweak production of
single top quarks is a difficult process to measure because
the expected production cross section for the
combined s- and t-channels ($\sigma_{st} \sim 2.9$ pb [4, 5]) is much
smaller than those of competing background processes,
and it is also smaller than the uncertainty on the total
background rate. The presence of only one top quark in
the event provides fewer features to use in separating the
signal from background, compared with measurements
of top pair production ($t\bar{t}$), which was first observed in
1995 [6].

To overcome these challenges, a variety of multivariate
techniques for separating single top events from the
backgrounds have been developed. Using different
combinations of techniques, both the CDF and D0 collaborations
have published evidence for single top quark production
at significance levels of 3.7 and 3.6 standard deviations,
respectively [7, 8]. The analysis described in this Letter
supersedes that of Ref. [7] and achieves a significantly
improved sensitivity by including a larger data sample
and by adding three new analyses. We report a signal
significance of 5.0 standard deviations, thus conclusively
observing electroweak production of single top quarks,
and we make the most precise measurement of $|V_{tb}|$ to
date.

We assume that single top quarks are produced in the
s- and t-channel modes with the SM ratio, and that the
branching ratio of the top quark to $Wb$ is 100%. We
seek events in which the W boson decays leptonically
in order to improve the signal-to-background ratio s/b.
We simulate single top events using the tree-level
matrix-element generator MADEVENT [9]. The t-channel signal
is modeled by the two processes $qb \to q't$ and
$qg \to q't\bar{b}$, which are combined to match the event kinematics
predicted by a fully differential NLO calculation [B], [l(| • 

A total of six analyses are combined to yield the final
results reported here. The likelihood function (LF),
matrix element (ME), and neural network (NN) analyses
of [7] are re-used with an additional 1 fb$^{-1}$ of integrated
luminosity; their methods remain unchanged. The three
new analyses introduced here are: a boosted decision tree
(BDT), a likelihood function optimized for s-channel single
top production (LFS), and a neural-network-based
analysis of events with missing transverse energy ($\not\!\!E_T$)
and jets (MJ). The BDT and LFS analyses use events
that overlap with the LF, ME, and NN analyses, while
the MJ analysis uses an orthogonal event selection that
adds about 30% to the signal acceptance. This paper
concentrates on the three new analyses and their combination
with the analyses of [7], using 3.2 fb$^{-1}$ of integrated
luminosity collected with the CDF II detector [T3|. 

For the LF, ME, NN, BDT, and LFS analyses we select
$\ell + \not\!\!E_T +$ jets events as described in [7], where $\ell$
is an explicitly reconstructed electron or muon from the
W boson decay and at least one jet is identified as
containing a B hadron. The background has contributions
from events in which a W boson is produced in association
with one or more heavy-flavor jets (W + HF), events
with mistakenly b-tagged light-flavor jets (mistags),
multijet events (QCD), $t\bar{t}$ and diboson processes, as well as
Z+jets events. The expected event yields in Table I are
estimated as in [7], where the signal, $t\bar{t}$, and diboson
categories are Monte Carlo (MC) predictions scaled to the
total integrated luminosity, while the remaining categories
use predictions derived from data control samples. The
uncertainties quoted in Table I include theoretical
uncertainties, the luminosity uncertainty for the MC
predictions, and experimental uncertainties for the data-driven
background normalizations.

The MJ analysis is designed to select events with $\not\!\!E_T$
and jets and to veto events selected by the $\ell + \not\!\!E_T +$ jets
analyses. It accepts events in which the W boson decays
into $\tau$ leptons and those in which the electron or muon
fails the lepton identification criteria. We use data
corresponding to 2.1 fb$^{-1}$ of integrated luminosity for the
MJ analysis and select events that have $\not\!\!E_T > 50$ GeV
and two jets within $|\eta| < 2.0$, at least one of which has
$|\eta| < 0.9$. The jet energy measurements include information
from both the calorimeter and the charged-particle
spectrometer. Events must have one jet with transverse
energy $E_T$ greater than 35 GeV, and a second jet with
$E_T$ greater than 25 GeV. The angular separation between
the two jets, $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$, is required to
exceed 1.0. We reject events with four or more jets with
$E_T > 15$ GeV in $|\eta| < 2.4$ in order to reduce the
multijet (QCD) and $t\bar{t}$ backgrounds. We identify b jets with
the same algorithm used in [7] supplemented with a jet
probability algorithm [13].
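The kinematic part of the MJ selection described above can be sketched in a few lines. This is only an illustration under stated assumptions: the event format (a missing-$E_T$ value plus a list of `(et, eta, phi)` jet tuples) and the function names are hypothetical, the two jets are taken to be the two highest-$E_T$ jets within $|\eta| < 2.0$, and the b-tagging and lepton-veto requirements are omitted.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation sqrt(d_eta^2 + d_phi^2), with d_phi wrapped into [-pi, pi]."""
    d_eta = eta1 - eta2
    d_phi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(d_eta, d_phi)

def passes_mj_kinematics(met, jets):
    """met in GeV; jets is a list of (et, eta, phi) tuples (hypothetical format)."""
    # Veto events with four or more jets with E_T > 15 GeV in |eta| < 2.4.
    n_veto_jets = sum(1 for et, eta, _ in jets if et > 15.0 and abs(eta) < 2.4)
    if n_veto_jets >= 4:
        return False
    # Missing transverse energy requirement.
    if met <= 50.0:
        return False
    # Assumption: use the two highest-E_T jets within |eta| < 2.0.
    central = sorted((j for j in jets if abs(j[1]) < 2.0),
                     key=lambda j: j[0], reverse=True)
    if len(central) < 2:
        return False
    lead, sub = central[0], central[1]
    if lead[0] <= 35.0 or sub[0] <= 25.0:
        return False
    # At least one of the two jets must be within |eta| < 0.9.
    if min(abs(lead[1]), abs(sub[1])) >= 0.9:
        return False
    # The two jets must be separated by Delta R > 1.0.
    return delta_r(lead[1], lead[2], sub[1], sub[2]) > 1.0
```

Wrapping $\Delta\phi$ matters because two jets at $\phi = 3.0$ and $\phi = -3.0$ are close in azimuth, not six radians apart.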

The primary background in the MJ analysis is
QCD events in which mismeasured jet energies produce
large $\not\!\!E_T$ aligned in the same direction as jets. To
reduce this background, we use the transverse momentum
imbalance ($\not\!\!p_T$) as measured in the spectrometer. This
variable is more strongly correlated with the neutrino energy and its
direction than $\not\!\!E_T$ in this class of events. The absolute
amounts of $\not\!\!E_T$ and $\not\!\!p_T$, the angle between them, the
azimuthal angles between $\not\!\!E_T$ or $\not\!\!p_T$ and the jet directions,
and several other less powerful variables are used as
inputs to a neural network (NN$_{\rm QCD}$). The NN$_{\rm QCD}$ output
is required to pass a threshold, removing 77% of the QCD
background while keeping 91% of the signal acceptance.
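As a short worked example of what these two efficiencies imply, the ratio of signal to QCD background changes by the factor (signal efficiency)/(QCD efficiency). The function name below is illustrative, and the calculation considers the QCD component only, not the other backgrounds.

```python
def signal_to_qcd_gain(signal_efficiency, qcd_rejection):
    """Factor by which signal/QCD improves after a cut that keeps
    `signal_efficiency` of the signal and removes `qcd_rejection` of QCD."""
    qcd_efficiency = 1.0 - qcd_rejection
    return signal_efficiency / qcd_efficiency

# Numbers quoted in the text: 91% signal kept, 77% of QCD removed.
gain = signal_to_qcd_gain(0.91, 0.77)
print(f"signal/QCD improves by a factor of about {gain:.1f}")
```

With the quoted numbers the signal-to-QCD ratio improves by roughly a factor of four for a 9% loss in signal acceptance.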

The backgrounds in the MJ analysis due to QCD
events and events with light-flavor jets produced in
association with W and Z bosons are estimated using data
in a control region composed of events in which the $\not\!\!E_T$ is
aligned with one of the jets. The observed and expected
event counts for the MJ analysis are given in the $\not\!\!E_T +$ jets
column of Table I.



TABLE I: Background composition and predicted number of
single top events in 3.2 fb$^{-1}$ of CDF Run II data for the
$\ell + \not\!\!E_T +$ jets samples (LF, ME, NN, and BDT analyses), and
2.1 fb$^{-1}$ of data for the $\not\!\!E_T +$ jets sample (MJ analysis).

Process             $\ell + \not\!\!E_T +$ jets    $\not\!\!E_T +$ jets
s-channel signal      77.3 ± 11.2          29.6 ± 3.7
t-channel signal     113.8 ± 16.9          34.5 ± 6.1
W + HF              1551.0 ± 472.3        304.4 ± 115.5
$t\bar{t}$               686.1 ± 99.4         184.5 ± 30.2
Z+jets                52.1 ± 8.0          128.6 ± 53.7
Diboson              118.4 ± 12.2          42.1 ± 6.7
QCD+mistags          777.9 ± 103.7        679.4 ± 27.9
Total prediction    3376.5 ± 504.9       1404 ± 172
Observed            3315                 1411
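The yields in Table I can be cross-checked with a few lines of arithmetic. The sketch below assumes the process labels pair with the rows of numbers in the order they appear in the table; the dictionary and function names are illustrative. It sums the central values, compares them with the quoted totals, and computes the overall s/b before any multivariate discrimination.

```python
# (central value, uncertainty) per process for the two samples:
# index 0 = lepton + missing E_T + jets, index 1 = missing E_T + jets.
TABLE_I = {
    "s-channel signal": ((77.3, 11.2), (29.6, 3.7)),
    "t-channel signal": ((113.8, 16.9), (34.5, 6.1)),
    "W + HF": ((1551.0, 472.3), (304.4, 115.5)),
    "ttbar": ((686.1, 99.4), (184.5, 30.2)),
    "Z+jets": ((52.1, 8.0), (128.6, 53.7)),
    "Diboson": ((118.4, 12.2), (42.1, 6.7)),
    "QCD+mistags": ((777.9, 103.7), (679.4, 27.9)),
}
SIGNAL = ("s-channel signal", "t-channel signal")

def column_summary(column):
    """Return (total prediction, signal, signal/background) for one sample."""
    signal = sum(TABLE_I[name][column][0] for name in SIGNAL)
    total = sum(entry[column][0] for entry in TABLE_I.values())
    return total, signal, signal / (total - signal)

total_lj, signal_lj, s_over_b_lj = column_summary(0)
total_mj, signal_mj, s_over_b_mj = column_summary(1)
print(f"lepton+MET+jets: total {total_lj:.1f}, signal {signal_lj:.1f}, s/b {s_over_b_lj:.3f}")
print(f"MET+jets:        total {total_mj:.1f}, signal {signal_mj:.1f}, s/b {s_over_b_mj:.3f}")
```

The summed central values agree with the quoted totals to within rounding, and the predicted signal in the lepton sample (about 191 events) is well below the 504.9-event uncertainty on the total prediction, which is the difficulty the introduction describes.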



After event selection, the samples are dominated by
background. We further discriminate the signal with
multivariate techniques. Each multivariate technique
defines a function which reduces several reconstructed
quantities for each event into a single output variable
whose distribution can be studied and fit to extract signal
and background contributions. Validation of the
background modeling for the input variables and output
distributions is a crucial step in the use of multivariate
techniques. We first describe the construction of our
multivariate tools and then the checks we used to verify the
validity of our background model. The LF, ME, and NN
discriminants are described in [7]. The BDT discriminant
uses a decision tree method that applies binary cuts
iteratively to classify events [14]. The discrimination is
further improved using a boosting algorithm [15, 16]. The
BDT discriminant uses over 20 input variables. Some of
the most sensitive are the neural-network jet-flavor
separator [17], the invariant mass of the $\ell\nu b$ system $M_{\ell\nu b}$, the
total scalar sum of transverse energy in the event $H_T$,
$Q \times \eta$ [18], the dijet mass $M_{jj}$, and the transverse mass
of the W boson.
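To make the idea of boosting binary cuts concrete, here is a minimal toy sketch. It is not the analysis's BDT: it assumes one input variable, one-cut trees ("stumps"), and AdaBoost-style reweighting, whereas the Letter does not specify the boosting algorithm in this section. All function names are illustrative.

```python
import math

def stump_predict(x, threshold, polarity):
    """A one-cut 'tree': returns polarity if x < threshold, else -polarity."""
    return polarity if x < threshold else -polarity

def train_boosted_stumps(xs, ys, n_rounds):
    """xs: input values; ys: labels (+1 signal, -1 background)."""
    n = len(xs)
    weights = [1.0 / n] * n
    # Toy choice of candidate cuts: just below the minimum and above each point.
    thresholds = [min(xs) - 0.5] + [x + 0.5 for x in sorted(xs)]
    ensemble = []
    for _ in range(n_rounds):
        best = None
        for t in thresholds:
            for p in (+1, -1):
                # Weighted misclassification rate of this cut.
                err = sum(w for w, x, y in zip(weights, xs, ys)
                          if stump_predict(x, t, p) != y)
                if best is None or err < best[0]:
                    best = (err, t, p)
        err, t, p = best
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, t, p))
        # Boosting step: raise the weights of misclassified events.
        weights = [w * math.exp(-alpha * y * stump_predict(x, t, p))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def bdt_output(ensemble, x):
    """Weighted vote of all stumps, normalized to lie in [-1, 1]."""
    norm = sum(alpha for alpha, _, _ in ensemble)
    return sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble) / norm

xs = [1, 2, 3, 4, 5, 6]
ys = [+1, +1, -1, -1, +1, +1]  # not separable with a single cut
ensemble = train_boosted_stumps(xs, ys, n_rounds=3)
predictions = [1 if bdt_output(ensemble, x) > 0 else -1 for x in xs]
```

In this toy example no single cut classifies more than four of the six events correctly, while the weighted vote of three boosted cuts classifies all six.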

The LFS discriminant uses projective likelihood
functions [19] to combine the separation power of several
variables and is optimized to be sensitive to the s-channel
process. The subset of the $\ell + \not\!\!E_T +$ jets sample with two
b-tagged jets is used and consists of 609 events. The
dominant backgrounds are W + HF and $t\bar{t}$ production. A
kinematic fitter is used to find the most likely resolution
of two ambiguities: the z-component of the neutrino
momentum and the b jet that most likely came from the top
quark decay. In addition to the outputs of the kinematic
fitter, other important inputs to the likelihood are the
invariant mass of the two b-tagged jets $M_{bb}$, the transverse
momentum of the $b\bar{b}$ system, the leading jet transverse
momentum, $M_{\ell\nu b}$, $H_T$, and $\not\!\!E_T$.
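A generic two-class projective likelihood can be sketched as follows. This illustrates the general technique only and is not the LFS construction itself: the binned templates, the variable ordering, and the function name are hypothetical, and a single background class is assumed.

```python
def projective_likelihood(bin_indices, signal_templates, background_templates):
    """Combine per-variable binned templates into one discriminant in [0, 1].

    bin_indices[i] is the bin of variable i into which the event falls;
    *_templates[i][b] is the normalized template content of bin b for variable i.
    Correlations between variables are ignored, which is what 'projective' means here.
    """
    signal_product = 1.0
    background_product = 1.0
    for var, b in enumerate(bin_indices):
        signal_product *= signal_templates[var][b]
        background_product *= background_templates[var][b]
    return signal_product / (signal_product + background_product)

# Toy templates for two variables with two bins each (illustrative numbers).
signal_templates = [[0.2, 0.8], [0.3, 0.7]]
background_templates = [[0.6, 0.4], [0.5, 0.5]]

signal_like = projective_likelihood([1, 1], signal_templates, background_templates)
background_like = projective_likelihood([0, 0], signal_templates, background_templates)
```

Events that fall in bins where the signal templates dominate are pushed toward 1, and background-like events toward 0.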

The MJ discriminant uses a neural network to combine
information from several input variables. The most
important variables are the invariant mass of the $\not\!\!E_T$ and
the second leading jet, the scalar sum of the jet energies,
the $\not\!\!E_T$, and the azimuthal angle between the $\not\!\!E_T$ and the
jets.

We combine the LF, ME, NN, BDT, and LFS channels
using a super-discriminant (SD) technique similar
to that which was applied in [7]. The SD method uses a
neural network trained with neuro-evolution [20] to
separate the signal from the background, taking as inputs the
discriminant outputs of the five analyses for each event.
With the super-discriminant analysis we improve the
sensitivity (defined below) by 13% over the best individual
analysis. We perform a simultaneous fit over the two
exclusive channels, MJ and SD, to obtain the final
combined results.

Before investigating the sample of selected events,
we used background-dominated data control samples to
check the modeling of each input variable as well as the
output distributions of each multivariate discriminant.
For the $\ell + \not\!\!E_T +$ jets analyses the control samples used
are the lepton + b-tagged four-jet sample, which is
enriched in $t\bar{t}$ events, and the two- and three-jet samples in
which there is no b-tagged jet. The latter are enriched in
W+jets and QCD events with kinematics similar to the
b-tagged signal samples and have high statistics, making it
possible to observe that the background model describes
the data well over three orders of magnitude in our
output discriminants. For the MJ analysis, three control
samples are used: in the first, the $\not\!\!E_T$ is required
to be aligned along one of the jets; in the second, the
events are required to fail the NN$_{\rm QCD}$ requirement; and
in the third, a lepton is required to be present. The data
distributions in all control samples are described well by
our models for each of the analysis input variables and a
large set of other variables not used as inputs. More than
two thousand distributions were checked for evidence of
mismodeling. Small discrepancies were found in the
distributions of the angles between two jets in the untagged
lepton + two-jet sample and the modeling of jets with
rapidity greater than 2.4. These effects are included as
systematic uncertainties on the shape of the background
models.

Figure 1 shows the distributions of the five
$\ell + \not\!\!E_T +$ jets discriminants. These are combined to give
the SD distribution shown in Fig. 2 together with the MJ
distribution. In the rightmost bins, assuming SM production
and decay, the SD has an s/b that exceeds 5.0. This
large s/b significantly reduces our sensitivity to
systematic uncertainties affecting the background. We use the
distributions of the SD and MJ discriminants to extract
the measured cross section and the signal significance.

We measure the single top cross section using a
Bayesian binned likelihood technique [21] assuming a
flat prior in the cross section and integrating the
posterior over all sources of systematic uncertainty. The
background rates are varied within uncertainties, but
are largely constrained by the data in the
background-enriched portions of the SD and MJ discriminant
distributions. Uncertainties on the shapes of these
distributions degrade the extrapolation of these constraints
to more signal-like regions. The sources of systematic
uncertainties affecting these shapes are discussed below
and are also included in all calculations. The
uncertainties assigned were conservatively chosen to cover the
full range of variations studied. We quote the measured
cross section as the value that maximizes the posterior
likelihood, and use the shortest interval containing 68%
of the integral of the posterior to set the uncertainties.
We calculate the significance as a p-value [21], which is
the probability, assuming single top quark production is
absent, that $-2\ln Q = -2\ln\left(p(\mathrm{data}|s+b)/p(\mathrm{data}|b)\right)$ is
less than that observed in the data. Figure 2(c) shows
the distributions of $-2\ln Q$ in pseudoexperiments that
assume SM single top (S + B) and also those that
assume single top production is absent (B), along with the
value observed in data. The effects of the systematic
uncertainties are included in the pseudoexperiments. We
convert the observed p-value into a number of standard
deviations using the integral of one side of a Gaussian
function.
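The two calculations in this paragraph can be sketched compactly. The first function evaluates $-2\ln Q$ for binned Poisson counts, assuming fixed signal and background expectations per bin (so without the systematic uncertainties that the analysis includes); the second converts a p-value into standard deviations with a one-sided Gaussian tail. Function names are illustrative, and the p-value used in the example is the one quoted later in this Letter.

```python
import math
from statistics import NormalDist

def minus_two_ln_q(observed, signal, background):
    """-2 ln [ p(data | s+b) / p(data | b) ] for independent Poisson bins.

    For one bin, ln Q = n * ln(1 + s/b) - s, because the n! terms cancel
    in the ratio of the two Poisson probabilities.
    """
    total = 0.0
    for n, s, b in zip(observed, signal, background):
        total += 2.0 * (s - n * math.log(1.0 + s / b))
    return total

def p_value_to_sigma(p_value):
    """Number of standard deviations z such that the one-sided
    Gaussian tail integral above z equals p_value."""
    return -NormalDist().inv_cdf(p_value)

# A signal-like bin (n close to s + b) gives a negative -2 ln Q.
example = minus_two_ln_q(observed=[15], signal=[5.0], background=[10.0])

# The p-value quoted in this Letter corresponds to about 5 standard deviations.
z = p_value_to_sigma(3.10e-7)
print(f"-2 ln Q = {example:.3f}, significance = {z:.2f} sigma")
```

More signal-like data give more negative values of $-2\ln Q$, which is why the p-value is defined as the probability of obtaining a value below the observed one under the background-only hypothesis.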

All sources of systematic uncertainty are included
and correlations between normalization and discriminant
shape changes are considered. Uncertainties in the jet
energy scale, b-tagging efficiencies, lepton identification
and trigger efficiencies, the amount of initial and final
state radiation, parton distribution functions,
factorization and renormalization scale, and background
modeling have been explored and incorporated in all individual
analyses and the combination. We include uncorrelated
MC statistical uncertainties in each bin of each
discriminant distribution. A ±2.5 GeV/$c^2$ uncertainty on the
top quark mass $m_t$ is included in the significance and
$|V_{tb}|$ results, but we quote the dependence on $m_t$ separately
for the cross section.



TABLE II: Results summary for the five correlated $\ell + \not\!\!E_T +$ jets
analyses combined by the SD analysis, the SD and the MJ
analysis, and the total combination. The LFS analysis
measures only the s-channel production cross section, while the
other analyses measure the sum of the s- and t-channel cross
sections.

Analysis    Cross Section (pb)         Significance (Std. Dev.)    Sensitivity (Std. Dev.)
LF          $1.6^{+0.8}_{-?}$          2.4                         4.0
ME          $2.5^{+0.7}_{-?}$          4.3                         4.9
NN          $1.?^{+0.6}_{-?}$          3.5                         5.2
BDT         $2.1^{+0.7}_{-?}$          3.5                         5.2
LFS         $1.?^{+0.9}_{-0.8}$        2.0                         1.1
SD          $2.1^{+0.6}_{-0.5}$        4.8                         > 5.9
MJ          ?                          2.1                         1.4
Combined    $2.3^{+0.6}_{-0.5}$        5.0                         > 5.9



Table II lists the measured cross sections and
significances for each of the component analyses and the
combination. The measured cross sections for the five
correlated analyses and the SD are close to each other
even though the analyses choose different input variables
and are optimized differently. We interpret the excess
of signal-like events over the expected background as
observation of single top production with a p-value of
$3.10 \times 10^{-7}$, corresponding to a signal significance of
5.0 standard deviations. The sensitivity is defined to
be the median expected significance and is in excess of
5.9 standard deviations, assuming the SM signal cross
section. The most probable value of the combined
s-channel and t-channel cross section is $2.3^{+0.6}_{-0.5}$ pb
assuming a top quark mass of 175 GeV/$c^2$. The dependence
on the top quark mass is +0.02 pb/(GeV/$c^2$). From the
cross section measurement at $m_t = 175$ GeV/$c^2$, we
obtain $|V_{tb}| = 0.91 \pm 0.11$ (stat + syst) $\pm 0.07$ (theory) and
limit $|V_{tb}| > 0.71$ at the 95% C.L. assuming a flat prior
in $|V_{tb}|^2$ from 0 to 1. This is the most precise direct
measurement of $|V_{tb}|$ to date.
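Because the production cross section scales with $|V_{tb}|^2$, a rough estimate of $|V_{tb}|$ follows from the square root of the ratio of measured to predicted cross sections. The sketch below is only an illustration: it assumes the approximate SM value of 2.9 pb quoted in the introduction as the prediction for $|V_{tb}| = 1$, and it does not reproduce the Bayesian treatment or the theory inputs behind the quoted 0.91. It also applies the quoted linear dependence of the measured cross section on the top quark mass. Function names are illustrative.

```python
import math

def vtb_estimate(sigma_measured_pb, sigma_sm_pb):
    """Rough |V_tb| from sigma_measured = |V_tb|^2 * sigma_SM."""
    return math.sqrt(sigma_measured_pb / sigma_sm_pb)

def cross_section_at_mass(m_top_gev, sigma_at_175_pb=2.3, slope_pb_per_gev=0.02):
    """Measured cross section shifted by the quoted mass dependence,
    +0.02 pb per GeV/c^2 relative to m_t = 175 GeV/c^2."""
    return sigma_at_175_pb + slope_pb_per_gev * (m_top_gev - 175.0)

rough_vtb = vtb_estimate(2.3, 2.9)        # about 0.89
sigma_170 = cross_section_at_mass(170.0)  # about 2.2 pb
print(f"rough |V_tb| = {rough_vtb:.2f}, sigma(m_t = 170) = {sigma_170:.2f} pb")
```

The rough estimate lands close to the quoted central value; the difference reflects the assumed round-number SM cross section and the simplified treatment.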

FIG. 1: Discriminant distributions for the $\ell + \not\!\!E_T +$ jets analyses. The data are indicated with points, and the predictions are
shown separately for each contribution with stacked histograms. The signal expectations shown are the SM predictions. The
insets show the distributions of the candidate events in the high-discriminant region.
[Figure 1 panels include: BDT Discriminant, LFS Discriminant]

FIG. 2: Discriminant distributions for the (a) SD and (b) MJ analyses (see Fig. 1 for their caption and legend). Panel (c)
shows the distribution of the likelihood ratio test statistic $-2\ln Q$.
[Figure 2 horizontal axes: Super Discriminant, MJ Discriminant, Test Statistic [-2ln(Q)]]

In summary, we combine six multivariate analysis
techniques to precisely measure the electroweak single top
production cross section and the CKM matrix element
$|V_{tb}|$. We have carefully cross-checked our analysis
techniques with data control samples and we assign generous
rate and shape uncertainties to all predictions we use.
Our combined discriminant allows us to purify a signal
sample with s/b > 5.0 in the most sensitive region,
allowing for a significant outcome in the presence of these
conservative systematic uncertainties. We observe single
top quark production for the first time with a significance
of 5.0 standard deviations.